Who Should Label What? Instance Allocation in Multiple Expert Active Learning

Authors

  • Byron C. Wallace
  • Kevin Small
  • Carla E. Brodley
  • Thomas A. Trikalinos
Abstract

The active learning (AL) framework is an increasingly popular strategy for reducing the amount of human labeling effort required to induce a predictive model. Most work in AL has assumed that a single, infallible oracle provides labels requested by the learner at a fixed cost. However, real-world applications suitable for AL often include multiple domain experts who provide labels of varying cost and quality. We explore this multiple expert active learning (MEAL) scenario and develop a novel algorithm for instance allocation that exploits the meta-cognitive abilities of novice (cheap) experts in order to make the best use of the experienced (expensive) annotators. We demonstrate that this strategy outperforms strong baseline approaches to MEAL on both a sentiment analysis dataset and two datasets from our motivating application of biomedical citation screening. Furthermore, we provide evidence that novice labelers are often aware of which instances they are likely to mislabel.

Similar resources

Active learning with uncertain labeling knowledge

Traditional active learning assumes that the labeler is capable of providing ground truth label for each queried instance. In reality, a labeler might not have sufficient knowledge to label a queried instance but can only guess the label with his/her best knowledge. As a result, the label provided by the labeler, who is regarded to have uncertain labeling knowledge, might be incorrect. In this ...

Active Learning with Multi-Label SVM Classification

Multi-label classification, where each instance is assigned to multiple categories, is a prevalent problem in data analysis. However, annotations of multi-label instances are typically more time-consuming or expensive to obtain than annotations of single-label instances. Though active learning has been widely studied on reducing labeling effort for single-label problems, current research on mult...

Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

A growing number of applications, e.g. video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the label...

Active Learning Literature Survey

The most time consuming and expensive task in machine learning is the gathering of labeled data to train the model or to estimate its parameters. In the real-world scenario, the availability of labeled data is scarce and we have limited resources to label the abundantly available unlabeled data. Hence it makes sense to pick only the most informative instances from the unlabeled data and request...

Active Learning with Near Misses

Assume that we are trying to build a visual recognizer for a particular class of objects—chairs, for example—using existing induction methods. Assume the assistance of a human teacher who can label an image of an object as a positive or a negative example. As positive examples, we can obviously use images of real chairs. It is not clear, however, what types of objects we should use as negative ...

Publication date: 2011